Supersense Tagger for Italian

Authors

  • Davide Picca
  • Alfio Massimiliano Gliozzo
  • Massimiliano Ciaramita
Abstract

In this paper we present the procedure we followed to develop the Italian SuperSense Tagger. In particular, we adapted the English SuperSense Tagger to Italian by exploiting a parallel sense-labeled corpus for training. Like the English tagger, the Italian tagger uses a fixed set of 26 semantic labels, called supersenses, but achieves slightly lower accuracy because of the lower quality of the Italian training data. Both taggers accomplish the same task of identifying entities and concepts belonging to a common set of ontological types. This parallelism allows us to define effective methodologies for a broad range of cross-language knowledge acquisition tasks.

Similar resources

A Resource and Tool for Super-sense Tagging of Italian Texts

A SuperSense Tagger is a tool for the automatic analysis of texts that associates to each noun, verb, adjective and adverb a semantic category within a general taxonomy. The developed tagger, based on a statistical model (Maximum Entropy), required the creation of an Italian annotated corpus, to be used as a training set, and the improvement of various existing tools. The obtained results signi...

SuperSense Tagging with a Maximum Entropy Markov Model

We tackled the task of SuperSense tagging by means of the Tanl Tagger, a generic, flexible and customizable sequence labeler, developed as part of the Tanl linguistic pipeline. The tagger can be configured to use different classifiers and to extract features according to feature templates expressed through patterns, so that it can be adapted to different tagging tasks, including PoS and Named E...

Broad-Coverage Sense Disambiguation and Information Extraction with a Supersense Sequence Tagger

This paper presents a novel approach to broad-coverage word sense disambiguation and information extraction. The task consists of annotating text with the tagset defined by the 41 Wordnet supersense classes for nouns and verbs. Since the tagset is directly related to Wordnet synsets, the tagger returns partial word sense disambiguation. Furthermore, since the noun tags include the standard name...

Supersense Tagging of Unknown Nouns Using Semantic Similarity

The limited coverage of lexical-semantic resources is a significant problem for NLP systems which can be alleviated by automatically classifying the unknown words. Supersense tagging assigns unknown nouns one of 26 broad semantic categories used by lexicographers to organise their manual insertion into WORDNET. Ciaramita and Johnson (2003) present a tagger which uses synonym set glosses as anno...

Automatic identification of semantic relations in Italian complex nominals

This paper addresses the problem of the identification of the semantic relations in Italian complex nominals (CNs) of the type N+P+N. We exploit the fact that the semantic relation, which is underspecified in most cases, is partially made explicit by the preposition. We develop an annotation framework around five different semantic relations, which we use to create a corpus of 1700 Italian CNs,...

Publication date: 2008